PubRunner: A light-weight framework for updating text mining results

Authors

  • Fabio Rinaldi
  • Julien Gobeill
Abstract

Biomedical text mining promises to assist biologists in quickly navigating the combined knowledge in their domain. This would allow improved understanding of the complex interactions within biological systems and faster hypothesis generation. New biomedical research articles are published daily, and text mining tools are only as good as the corpus from which they work. Many text mining tools are underused because their results are static and do not reflect the constantly expanding knowledge in the field. For biomedical text mining to become an indispensable tool used by researchers, this problem must be addressed. To this end, we present PubRunner, a framework for regularly running text mining tools on the latest publications. PubRunner is lightweight, simple to use, and can be integrated with an existing text mining tool. The workflow involves downloading the latest abstracts from PubMed, executing a user-defined tool, pushing the resulting data to a public FTP, and publicizing the location of these results on the public PubRunner website. This shows a proof of concept that we hope will encourage text mining developers to build tools that truly will aid biologists in exploring the latest publications. This article is included in the Container Virtualization in Bioinformatics collection.
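The four-step workflow named in the abstract can be illustrated with a minimal sketch. This is not PubRunner's actual code or API: the names `run_update_cycle` and `UpdateReport`, the injected step functions, and the `ftp://example.org` address are all hypothetical, and the stand-in steps avoid any network access so the example runs as written.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class UpdateReport:
    """Summary of one update pass (hypothetical structure)."""
    n_abstracts: int
    result_location: str


def run_update_cycle(
    fetch_abstracts: Callable[[], Iterable[str]],
    run_tool: Callable[[List[str]], str],
    upload: Callable[[str], str],
    announce: Callable[[str], None],
) -> UpdateReport:
    """Run one pass of the four-step workflow using caller-supplied steps."""
    abstracts = list(fetch_abstracts())  # 1. download the latest abstracts
    results = run_tool(abstracts)        # 2. execute the user-defined tool
    location = upload(results)           # 3. push the results to public storage
    announce(location)                   # 4. publicize where the results live
    return UpdateReport(len(abstracts), location)


# Stand-in steps (assumptions for illustration only, no real PubMed or FTP calls).
announced: List[str] = []
report = run_update_cycle(
    fetch_abstracts=lambda: ["BRCA1 interacts with BARD1.", "TP53 regulates apoptosis."],
    run_tool=lambda docs: "\n".join(d.upper() for d in docs),
    upload=lambda data: "ftp://example.org/results/latest.txt",
    announce=announced.append,
)
```

Passing each step in as a function keeps the scheduling logic independent of any particular text mining tool, which mirrors the abstract's point that the framework wraps an existing tool rather than replacing it.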

Similar resources

PubRunner: A light-weight framework for updating text mining results

Biomedical text mining promises to assist biologists in quickly navigating the combined knowledge in their domain. This would allow improved understanding of the complex interactions within biological systems and faster hypothesis generation. New biomedical research articles are published daily and text mining tools are only as good as the corpus from which they work. Many text mining tools are...

Efficient Updating of Discovered Patterns for Text Mining: A Survey

Text mining is the technique of retrieving interesting information from text documents. By devising patterns, we can retrieve high-quality information. There are many techniques for mining useful patterns from text documents. Researchers are still working on efficient updating of discovered patterns. Polysemy and synonymy are problems faced in the term-based approach. Phrase ...

A Sociolinguistic Scrutiny of the Great Gatsby and its Persian Translation in Light of Hatim and Mason’s Framework

Translation studies essentially deals with a socio-communicatively driven and contextualized enterprise. Viewed hence, it seems that no discipline tends to provide the possibility of studying the interrelations between interlocutors to generate meaning within the interactive social context as precisely as sociolinguistics (Federici, 2018). A sociolinguistic approach to translation seems to be i...

Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining: the unsupervised classification of similar documents into different groups. The most important step...

Publication date: 2017